{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
     "## Loss Functions and Optimizers"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "---"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "#### Introduction"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "In the previous lab, we worked through an example of solving a linear regression problem with gradient descent, defining the loss function and the weight updates by hand. PyTorch actually ships ready-made tools for exactly this, letting us implement the loss function, the weight updates, and the gradient computation concisely and quickly."
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "#### Key Points"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "- Defining a loss function\n",
     "- Defining an optimizer\n",
     "- The steps of model training"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "---"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "### Training the Model"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "#### The Loss Function"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "In the previous lab we solved the linear problem with a hand-written loss function. In fact, `torch.nn` provides many loss functions out of the box; for example, the mean squared error loss is available as `torch.nn.MSELoss()`."
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "First, let's define the variables and the dataset we need:"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "source": [
    "import torch\r\n",
    "import torch.nn as nn\r\n",
    "\r\n",
     "# Initialize the dataset\r\n",
     "X = torch.tensor([1, 2, 3, 4], dtype=torch.float32)\r\n",
     "Y = torch.tensor([2, 4, 6, 8], dtype=torch.float32)\r\n",
     "w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)\r\n",
     "\r\n",
     "\r\n",
     "def forward(x):\r\n",
     "    # the forward pass\r\n",
     "    return w * x\r\n",
     "\r\n",
     "\r\n",
     "# Quick check\r\n",
     "pre = forward(X)\r\n",
     "pre"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "tensor([0., 0., 0., 0.], grad_fn=<MulBackward0>)"
      ]
     },
     "metadata": {},
     "execution_count": 1
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "Next, let's use `nn.MSELoss()` to compute the loss between the current predictions and the ground truth:"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "source": [
     "# Define the loss function with MSE\r\n",
     "loss = nn.MSELoss()\r\n",
     "# Check the loss\r\n",
     "loss(forward(X), Y)"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "tensor(30., grad_fn=<MseLossBackward>)"
      ]
     },
     "metadata": {},
     "execution_count": 3
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "#### The Optimizer"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "An optimizer can be thought of as a toolkit that applies gradient descent to solve for the required parameters automatically. PyTorch provides them in the `torch.optim` module, which contains a variety of improved gradient-descent algorithms such as SGD, Momentum, RMSProp, and Adam. All of them build on plain gradient descent, and the improvements let them find the best model parameters faster and more accurately."
   ],
   "metadata": {}
  },
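   {
    "cell_type": "markdown",
    "source": [
     "All of these optimizers share the same construction pattern. As a quick sketch (the tensor `p` below is just an illustrative stand-in for a model parameter):\n",
     "\n",
     "```python\n",
     "import torch\n",
     "\n",
     "p = torch.zeros(3, requires_grad=True)\n",
     "sgd = torch.optim.SGD([p], lr=0.01, momentum=0.9)  # SGD with Momentum\n",
     "adam = torch.optim.Adam([p], lr=0.001)             # Adam\n",
     "print(type(sgd).__name__, type(adam).__name__)\n",
     "```"
    ],
    "metadata": {}
   },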
  {
   "cell_type": "markdown",
   "source": [
     "We can define an SGD optimizer as follows:"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "    optimizer = torch.optim.SGD([w], lr=learning_rate)"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "The first argument is the list of parameters (weights) in the loss function, i.e. the values we want to solve for; `lr` is the learning rate, the step size of gradient descent."
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "Since real models are usually complicated multivariate functions, updating every variable by hand after each gradient-descent step would be tedious. With an optimizer, we can update all variables at once using the following functions:"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "- `optimizer.step()`: update the parameters of the model (neural network), i.e. move every parameter one step in the direction opposite to its gradient.\n",
     "- `optimizer.zero_grad()`: clear the gradients of the variables involved in the loss function."
   ],
   "metadata": {}
  },
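   {
    "cell_type": "markdown",
    "source": [
     "A minimal sketch of one update on a toy quadratic loss (the numbers here are hypothetical, chosen so the arithmetic is easy to follow):\n",
     "\n",
     "```python\n",
     "import torch\n",
     "\n",
     "w = torch.tensor(0.0, requires_grad=True)\n",
     "opt = torch.optim.SGD([w], lr=0.1)\n",
     "\n",
     "l = (w - 2.0) ** 2      # loss with its minimum at w = 2\n",
     "l.backward()            # dl/dw = 2 * (w - 2) = -4\n",
     "opt.step()              # w <- w - lr * grad = 0 - 0.1 * (-4) = 0.4\n",
     "opt.zero_grad()         # clear the accumulated gradient\n",
     "print(w.item())         # approximately 0.4\n",
     "```"
    ],
    "metadata": {}
   },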
  {
   "cell_type": "markdown",
   "source": [
     "Putting this together, let's solve the linear regression problem once from start to finish."
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "First, define the loss function and the optimizer:"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "source": [
     "# Define the loss and the optimizer\r\n",
     "learning_rate = 0.01\r\n",
     "n_iters = 100\r\n",
     "loss = nn.MSELoss()\r\n",
     "optimizer = torch.optim.SGD([w], lr=learning_rate)\r\n",
     "optimizer"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "SGD (\n",
       "Parameter Group 0\n",
       "    dampening: 0\n",
       "    lr: 0.01\n",
       "    momentum: 0\n",
       "    nesterov: False\n",
       "    weight_decay: 0\n",
       ")"
      ]
     },
     "metadata": {},
     "execution_count": 4
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "Next, run the forward pass, compute the gradients, and update the weight:"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "source": [
     "# The training loop\r\n",
     "for epoch in range(n_iters):\r\n",
     "    y_predicted = forward(X)\r\n",
     "    # compute the loss\r\n",
     "    l = loss(y_predicted, Y)\r\n",
     "    # compute the gradients\r\n",
     "    l.backward()\r\n",
     "    # update the weight, i.e. one step opposite to the gradient\r\n",
     "    optimizer.step()\r\n",
     "    # clear the gradients\r\n",
     "    optimizer.zero_grad()\r\n",
     "\r\n",
     "    if epoch % 10 == 0:\r\n",
     "        print('epoch ', epoch+1, ': w = ', w, ' loss = ', l)\r\n",
     "\r\n",
     "print(f'Model prediction for x = 5: y = {forward(5):.3f}')"
   ],
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "epoch  1 : w =  tensor(0.3000, requires_grad=True)  loss =  tensor(30., grad_fn=<MseLossBackward>)\n",
      "epoch  11 : w =  tensor(1.6653, requires_grad=True)  loss =  tensor(1.1628, grad_fn=<MseLossBackward>)\n",
      "epoch  21 : w =  tensor(1.9341, requires_grad=True)  loss =  tensor(0.0451, grad_fn=<MseLossBackward>)\n",
      "epoch  31 : w =  tensor(1.9870, requires_grad=True)  loss =  tensor(0.0017, grad_fn=<MseLossBackward>)\n",
      "epoch  41 : w =  tensor(1.9974, requires_grad=True)  loss =  tensor(6.7705e-05, grad_fn=<MseLossBackward>)\n",
      "epoch  51 : w =  tensor(1.9995, requires_grad=True)  loss =  tensor(2.6244e-06, grad_fn=<MseLossBackward>)\n",
      "epoch  61 : w =  tensor(1.9999, requires_grad=True)  loss =  tensor(1.0176e-07, grad_fn=<MseLossBackward>)\n",
      "epoch  71 : w =  tensor(2.0000, requires_grad=True)  loss =  tensor(3.9742e-09, grad_fn=<MseLossBackward>)\n",
      "epoch  81 : w =  tensor(2.0000, requires_grad=True)  loss =  tensor(1.4670e-10, grad_fn=<MseLossBackward>)\n",
      "epoch  91 : w =  tensor(2.0000, requires_grad=True)  loss =  tensor(5.0768e-12, grad_fn=<MseLossBackward>)\n",
       "Model prediction for x = 5: y = 10.000\n"
     ]
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "After 100 iterations, the learned weight w matches the true value and the loss is vanishingly close to 0."
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "#### Building the Model"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "Beyond gradient computation, weight updates, and gradient clearing, PyTorch also provides model definitions, so we no longer need to write the `forward` function by hand. PyTorch ships predefined models ready to use; for example, `torch.nn.Linear(input_size, output_size)` is a linear model.\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "- `input_size`: the dimension of the input data\n",
     "- `output_size`: the dimension of the output data"
   ],
   "metadata": {}
  },
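   {
    "cell_type": "markdown",
    "source": [
     "For example, a layer mapping 3 input features to 2 output features (sizes chosen arbitrarily for illustration) stores a weight matrix of shape `(output_size, input_size)` and a bias of shape `(output_size,)`:\n",
     "\n",
     "```python\n",
     "import torch\n",
     "from torch import nn\n",
     "\n",
     "layer = nn.Linear(3, 2)    # 3 input features -> 2 output features\n",
     "x = torch.randn(4, 3)      # a batch of 4 samples\n",
     "print(layer.weight.shape)  # torch.Size([2, 3])\n",
     "print(layer.bias.shape)    # torch.Size([2])\n",
     "print(layer(x).shape)      # torch.Size([4, 2])\n",
     "```"
    ],
    "metadata": {}
   },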
  {
   "cell_type": "markdown",
   "source": [
     "To summarize, solving a linear problem breaks down into three steps:"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "1. Define the model (i.e. the forward pass).\n",
     "2. Define the loss and the optimizer.\n",
     "3. Train the model (forward pass, backward pass, weight update, repeat)."
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "First, let's define the linear model with PyTorch:\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "source": [
     "# Since we are using PyTorch, every variable is a tensor\r\n",
     "# PyTorch expects a sample axis, so each sample gets its own row:\r\n",
     "# x = [1, 2, 3, 4], y = [2, 4, 6, 8] would count as a single sample\r\n",
     "X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)\r\n",
     "Y = torch.tensor([[2], [4], [6], [8]], dtype=torch.float32)\r\n",
     "X_test = torch.tensor([5], dtype=torch.float32)\r\n",
     "\r\n",
     "# 1. Define the model\r\n",
     "# the first dimension is always the sample axis\r\n",
     "n_samples, n_features = X.shape\r\n",
     "print(X.shape)\r\n",
     "# input and output have the same dimension here\r\n",
     "model = nn.Linear(in_features=n_features, out_features=n_features, bias=True)\r\n",
    "model"
   ],
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "torch.Size([4, 1])\n"
     ]
    },
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "Linear(in_features=1, out_features=1, bias=True)"
      ]
     },
     "metadata": {},
     "execution_count": 18
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "During training we can call `model(x)` directly as the forward pass; it returns the prediction for the input x."
   ],
   "metadata": {}
  },
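   {
    "cell_type": "markdown",
    "source": [
     "Under the hood, `model(x)` for a linear layer simply computes the affine map `x @ weight.T + bias`; a quick sketch to confirm the equivalence:\n",
     "\n",
     "```python\n",
     "import torch\n",
     "from torch import nn\n",
     "\n",
     "model = nn.Linear(1, 1)\n",
     "x = torch.tensor([[5.0]])\n",
     "manual = x @ model.weight.T + model.bias  # the underlying affine map\n",
     "print(torch.allclose(model(x), manual))   # True\n",
     "```"
    ],
    "metadata": {}
   },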
  {
   "cell_type": "markdown",
   "source": [
     "Next, let's define the optimizer and the loss function:"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "source": [
     "# 2. Define the optimizer and the loss function\r\n",
     "learning_rate = 0.1\r\n",
     "n_iters = 100\r\n",
     "\r\n",
     "loss = nn.MSELoss()\r\n",
     "# model.parameters() hands the optimizer all the weights the model needs to learn\r\n",
    "optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)\r\n",
    "optimizer"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "SGD (\n",
       "Parameter Group 0\n",
       "    dampening: 0\n",
       "    lr: 0.1\n",
       "    momentum: 0\n",
       "    nesterov: False\n",
       "    weight_decay: 0\n",
       ")"
      ]
     },
     "metadata": {},
     "execution_count": 19
    }
   ],
   "metadata": {}
  },
  },
  {
   "cell_type": "markdown",
   "source": [
     "Finally, we can train the model with the model, optimizer, and loss function defined above (i.e. run gradient descent to find the weights that minimize the loss):"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "source": [
     "# 3. Train the model. The steps are fixed: forward pass, loss, backward pass, weight update, clear gradients\r\n",
     "for epoch in range(n_iters):\r\n",
     "    # forward pass\r\n",
     "    y_predicted = model(X)\r\n",
     "    # compute the loss\r\n",
     "    l = loss(y_predicted, Y)\r\n",
     "    # backward pass\r\n",
     "    l.backward()\r\n",
     "    # update the weights\r\n",
     "    optimizer.step()\r\n",
     "    # clear the gradients\r\n",
     "    optimizer.zero_grad()\r\n",
     "\r\n",
     "    if epoch % 10 == 0:\r\n",
     "        [w, b] = model.parameters()  # unpack parameters\r\n",
     "        print('epoch ', epoch+1, ': w = ', w[0][0].item(), ' loss = ', l)\r\n",
     "\r\n",
     "# note: the unpacking above rebound w to the model's weight, so forward(X_test) uses the trained weight\r\n",
     "print('Model prediction for x = 5:', forward(X_test))"
   ],
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "epoch  1 : w =  2.585728168487549  loss =  tensor(8.7596, grad_fn=<MseLossBackward>)\n",
      "epoch  11 : w =  1.9647918939590454  loss =  tensor(0.0063, grad_fn=<MseLossBackward>)\n",
      "epoch  21 : w =  1.9654924869537354  loss =  tensor(0.0019, grad_fn=<MseLossBackward>)\n",
      "epoch  31 : w =  1.9743818044662476  loss =  tensor(0.0010, grad_fn=<MseLossBackward>)\n",
      "epoch  41 : w =  1.9810937643051147  loss =  tensor(0.0005, grad_fn=<MseLossBackward>)\n",
      "epoch  51 : w =  1.9860492944717407  loss =  tensor(0.0003, grad_fn=<MseLossBackward>)\n",
      "epoch  61 : w =  1.9897059202194214  loss =  tensor(0.0002, grad_fn=<MseLossBackward>)\n",
      "epoch  71 : w =  1.9924041032791138  loss =  tensor(8.8518e-05, grad_fn=<MseLossBackward>)\n",
      "epoch  81 : w =  1.9943950176239014  loss =  tensor(4.8197e-05, grad_fn=<MseLossBackward>)\n",
      "epoch  91 : w =  1.9958642721176147  loss =  tensor(2.6242e-05, grad_fn=<MseLossBackward>)\n",
       "Model prediction for x = 5: tensor([[9.9843]], grad_fn=<MulBackward0>)\n"
     ]
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "As you can see, the overall workflow is fixed:"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "1. Define the model with `nn.Linear`.\n",
     "2. Define the loss with `nn.MSELoss`.\n",
     "3. Define the optimizer with `torch.optim`.\n",
     "4. Train the model with gradient descent."
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "And the training loop itself is also fixed:"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "1. Run the forward pass with `model(X)`.\n",
     "2. Compute the loss with `loss(y_predicted, Y)`.\n",
     "3. Compute the gradients with `loss.backward()`.\n",
     "4. Update the weights with `optimizer.step()`.\n",
     "5. Clear the gradients with `optimizer.zero_grad()`.\n",
     "6. Repeat steps 1-5."
   ],
   "metadata": {}
  },
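   {
    "cell_type": "markdown",
    "source": [
     "The six steps above can be wrapped in a small reusable helper (a sketch; `train` is just an illustrative name, not a PyTorch API):\n",
     "\n",
     "```python\n",
     "import torch\n",
     "from torch import nn\n",
     "\n",
     "def train(model, X, Y, loss_fn, optimizer, n_iters=100):\n",
     "    for _ in range(n_iters):\n",
     "        y_predicted = model(X)          # 1. forward pass\n",
     "        l = loss_fn(y_predicted, Y)     # 2. compute the loss\n",
     "        l.backward()                    # 3. compute the gradients\n",
     "        optimizer.step()                # 4. update the weights\n",
     "        optimizer.zero_grad()           # 5. clear the gradients\n",
     "    return model                        # 6. the loop repeats steps 1-5\n",
     "\n",
     "X = torch.tensor([[1.0], [2.0], [3.0], [4.0]])\n",
     "Y = torch.tensor([[2.0], [4.0], [6.0], [8.0]])\n",
     "model = nn.Linear(1, 1)\n",
     "optimizer = torch.optim.SGD(model.parameters(), lr=0.1)\n",
     "train(model, X, Y, nn.MSELoss(), optimizer)\n",
     "print(model(torch.tensor([[5.0]])).item())  # close to 10\n",
     "```"
    ],
    "metadata": {}
   },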
  {
   "cell_type": "markdown",
   "source": [
     "PyTorch thus greatly simplifies the programming work: just by swapping the model, the loss function, the optimizer, and the parameter values, we can train different models and solve different deep learning problems."
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "### Summary"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "This lab walked through how to solve for a model with PyTorch. The same procedure applies both to classic machine learning problems and to neural network models."
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
     "<hr><div style=\"color: #999; font-size: 12px;\"><i class=\"fa fa-copyright\" aria-hidden=\"true\"> The content of this course is copyrighted by 蓝桥云课; reproduction, downloading, or unlawful distribution is prohibited.</i></div>"
   ],
   "metadata": {}
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}